Guest tracing improvements to use tracing crate
#844
I made a high-level review, and what I've seen looks good :-)
First round review looks good to me. I am curious what the performance looks like with tracing vs. without. Maybe we could add a benchmark or something for this?
Also, is there any possibility that we don't flush spans/records after exiting the guest, and that some end up not being emitted?
Another thing to consider is the log crate vs. the tracing crate. Should we ditch one? Or is there some mechanism that allows regular logs to be consumed by the tracing crate? And should we expose any of these in the guest C API?
Also, are the tracing buffer sizes configurable? If not, maybe they should be, so users can tweak them in case they affect performance.
+1
Ok, I can do that.
Hmm, in my limited testing I haven't seen this case, but I wouldn't exclude the possibility.
I am not sure what the best approach is.
The tracing buffer is compile-time configurable, which I agree is not ideal for customers.
I've run some benchmarks locally and here are the results.
Runtimes:
Some thoughts:
This looks good to me in general, but I think maybe we should consider what tunables we need
I see this version, which uses OpenTelemetry, as an incremental step, after which additional tunables can be added to improve usability. Improvements:
Really like the idea of using the industry standard solution for tracing!
I've made a couple of comments. I don't think any of them need to be addressed here, but we should consider them as future enhancements.
I'm a big fan of this! Btw, what happens when the buffer in the guest is full and an event/span comes in? Also, a question: is `parent = Span::current()` required inside all the `#[instrument(skip_all, parent = Span::current(), level = "Trace")]` attributes, or is it the default?
The buffer length is checked every time after a new event/span is pushed, so if it became full, it is emptied afterwards. The `parent = Span::current()` is not needed. I am planning to change that afterwards, along with some other small things.
- This feature is not used separately from mem_profile
- All the unwind logic is now gated by mem_profile
Signed-off-by: Doru Blânzeanu <[email protected]>

- The guest side does not use this type of OutBAction
- The stack unwinding is done either way when the mem_profile feature is enabled
Signed-off-by: Doru Blânzeanu <[email protected]>

- This helps with keeping code separate and easily gating it out
Signed-off-by: Doru Blânzeanu <[email protected]>

- This step cleans up the codebase for the new way of tracing guests
- The current method involves custom macros and logic that are not the best for maintainability
Signed-off-by: Doru Blânzeanu <[email protected]>

- Define a separate struct that holds the functionality related to memory profiling of the guest
Signed-off-by: Doru Blânzeanu <[email protected]>

- Rename TraceInfo to reflect that it is only used by mem_profile
Signed-off-by: Doru Blânzeanu <[email protected]>

Signed-off-by: Doru Blânzeanu <[email protected]>

- Adds a type that implements the Subscriber trait of the tracing_core crate, which allows the type to be set as the global Subscriber of the crate
- This way we can handle the adding of new spans and events and store them where/how we want
Signed-off-by: Doru Blânzeanu <[email protected]>

- Implement add_span and event methods that store the info and report it to the host when the buffer gets full
Signed-off-by: Doru Blânzeanu <[email protected]>

Signed-off-by: Doru Blânzeanu <[email protected]>

- Parse the spans and events coming from the guest and create corresponding spans and events on the host that mimic a single call from the host
- Create a `TraceContext` that handles a call into a guest
Signed-off-by: Doru Blânzeanu <[email protected]>

- Conditionally handle logs either through tracing or the dedicated VM exit, based on whether tracing is initialized on the guest
- Modify `test_with_small_stack_and_heap` to 18kB because the `#[instrument]` attributes use more stack
Signed-off-by: Doru Blânzeanu <[email protected]>

Signed-off-by: Doru Blânzeanu <[email protected]>

Signed-off-by: Doru Blânzeanu <[email protected]>
Description
This PR closes #723, #704 and partially addresses #318.
These changes modify the way we perform guest tracing to use the `tracing` crate and its macros (`instrument`, `trace`).
How it works
Guest
What makes this possible is the implementation of the `Subscriber` trait in the `hyperlight-guest-tracing` crate. By implementing it, we can now handle the capturing of spans and events and choose how to store them and when to export them to the host.
The `GuestSubscriber` type that implements `Subscriber` keeps an internal `TraceState` that holds all the needed information.
Whenever a new span is created, entered or exited, a callback on the subscriber is called so that we can handle it. The same happens for events.
Each time a new span or event is added to the internal state, we check whether the buffer is full and, if so, send its contents to the host to process.
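The push-then-check behaviour described above can be sketched roughly as follows. This is a std-only illustration, not the code in `hyperlight-guest-tracing`: the `Record` enum, the fields of this stand-in `TraceState`, the `CAP` const parameter and the `send` callback (standing in for the VM exit to the host) are all assumptions made for the example.

```rust
/// Hypothetical record kinds; the real crate's types may differ.
#[derive(Debug, Clone, PartialEq)]
enum Record {
    NewSpan { id: u64, parent: Option<u64>, name: &'static str },
    Enter { id: u64 },
    Exit { id: u64 },
    Event { span: Option<u64>, message: &'static str },
}

/// Stand-in for the internal trace state: a buffer with a compile-time
/// capacity plus the bookkeeping needed to track the current span.
struct TraceState<const CAP: usize> {
    buffer: Vec<Record>,
    next_id: u64,
    stack: Vec<u64>,
}

impl<const CAP: usize> TraceState<CAP> {
    fn new() -> Self {
        Self { buffer: Vec::with_capacity(CAP), next_id: 1, stack: Vec::new() }
    }

    /// Push a record, then check the length; a full buffer is handed to
    /// `send` (assumed to model the exit to the host) and emptied.
    fn push(&mut self, record: Record, send: &mut dyn FnMut(&[Record])) {
        self.buffer.push(record);
        if self.buffer.len() >= CAP {
            send(&self.buffer);
            self.buffer.clear();
        }
    }

    /// Callback for a newly created span; the parent is the current span.
    fn new_span(&mut self, name: &'static str, send: &mut dyn FnMut(&[Record])) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        let parent = self.stack.last().copied();
        self.push(Record::NewSpan { id, parent, name }, send);
        id
    }

    /// Callback for entering a span.
    fn enter(&mut self, id: u64, send: &mut dyn FnMut(&[Record])) {
        self.stack.push(id);
        self.push(Record::Enter { id }, send);
    }

    /// Callback for exiting a span.
    fn exit(&mut self, id: u64, send: &mut dyn FnMut(&[Record])) {
        if self.stack.last() == Some(&id) {
            self.stack.pop();
        }
        self.push(Record::Exit { id }, send);
    }

    /// Callback for an event, attached to the current span if there is one.
    fn event(&mut self, message: &'static str, send: &mut dyn FnMut(&[Record])) {
        let span = self.stack.last().copied();
        self.push(Record::Event { span, message }, send);
    }
}

/// Usage: returns the batches that were sent and the number of records
/// still buffered at the end.
fn demo_guest_buffer() -> (Vec<Vec<Record>>, usize) {
    let mut sent: Vec<Vec<Record>> = Vec::new();
    let mut send = |batch: &[Record]| sent.push(batch.to_vec());
    let mut state = TraceState::<4>::new();

    let outer = state.new_span("guest_call", &mut send);
    state.enter(outer, &mut send);
    state.event("hello from guest", &mut send);
    let inner = state.new_span("inner", &mut send); // fourth record fills the buffer
    state.enter(inner, &mut send);
    state.exit(inner, &mut send);
    state.exit(outer, &mut send);

    let pending = state.buffer.len();
    (sent, pending)
}
```

In this sketch, records pushed after the last full batch stay in the buffer until something else sends them, which is the situation the flush question in the review is about.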
Host
When the host detects a VM exit from the guest, it checks whether it contains tracing information in the `OutB` instruction.
When tracing information is found, the host goes through it and checks it against the local storage of spans.
The spans' parents are set based on the information received from the host.
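As a rough illustration of the host-side processing, the sketch below walks a batch of guest records and updates a local span store. It is std-only and hypothetical: the `GuestRecord` and `HostSpan` types, the fields of this stand-in `TraceContext` and the `process` method are assumptions for the example, and the real host creates `tracing` spans rather than plain structs.

```rust
use std::collections::HashMap;

/// Hypothetical wire records received from the guest.
#[derive(Debug, Clone, PartialEq)]
enum GuestRecord {
    NewSpan { id: u64, parent: Option<u64>, name: String },
    Exit { id: u64 },
    Event { span: Option<u64>, message: String },
}

/// Hypothetical host-side mirror of a guest span.
#[derive(Debug, PartialEq)]
struct HostSpan {
    name: String,
    parent: Option<u64>,
    events: Vec<String>,
    closed: bool,
}

/// Stand-in for the per-call context that holds the local storage of spans.
#[derive(Default)]
struct TraceContext {
    spans: HashMap<u64, HostSpan>,
}

impl TraceContext {
    /// Walk one batch from the guest and check each record against the
    /// spans already known to the host.
    fn process(&mut self, batch: &[GuestRecord]) {
        for record in batch {
            match record {
                GuestRecord::NewSpan { id, parent, name } => {
                    // Keep the parent link only if the host already knows that span.
                    let parent = (*parent).filter(|p| self.spans.contains_key(p));
                    self.spans.entry(*id).or_insert(HostSpan {
                        name: name.clone(),
                        parent,
                        events: Vec::new(),
                        closed: false,
                    });
                }
                GuestRecord::Exit { id } => {
                    if let Some(span) = self.spans.get_mut(id) {
                        span.closed = true;
                    }
                }
                GuestRecord::Event { span, message } => {
                    // Events without a known span are dropped in this sketch.
                    if let Some(id) = span {
                        if let Some(host_span) = self.spans.get_mut(id) {
                            host_span.events.push(message.clone());
                        }
                    }
                }
            }
        }
    }
}

/// Usage: two batches arriving from the same guest call.
fn demo_host_processing() -> TraceContext {
    let mut ctx = TraceContext::default();
    ctx.process(&[
        GuestRecord::NewSpan { id: 1, parent: None, name: "guest_call".to_string() },
        GuestRecord::NewSpan { id: 2, parent: Some(1), name: "inner".to_string() },
        GuestRecord::Event { span: Some(2), message: "hello from guest".to_string() },
    ]);
    ctx.process(&[GuestRecord::Exit { id: 2 }, GuestRecord::Exit { id: 1 }]);
    ctx
}
```

Keeping the store across batches is what lets a record in a later batch refer to a span created in an earlier one.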
TODO
These do not correctly set the parents of the spans created in the host to the last one created in the guest before doing the VM exit
I need to find a way to propagate the context into the guest and back whenever it is needed. But using the OpenTelemetry propagators needs `std` support, which we do not have in the guest.
Jaeger picture of a Guest call that calls back into the host